PodCastle: A Web 2.0 Approach to Speech Recognition Research (INTERSPEECH 2007)

Authors

  • Masataka Goto
  • Jun Ogata
  • Kouichirou Eto
Abstract

In this paper, we describe a public web service, “PodCastle”, that provides full-text searching of Japanese podcasts on the basis of automatic speech recognition. This is an instance of our research approach, “Speech Recognition Research 2.0”, which is aimed at providing users with a web service based on Web 2.0 so that they can experience state-of-the-art speech recognition performance, and at promoting speech recognition technologies in cooperation with anonymous users. PodCastle enables users to find podcasts that include a search term, read full texts of their recognition results, and easily correct recognition errors. The results of the error correction can then be used to improve the performance of both full-text search and speech recognition. Although we know of no state-of-the-art speech recognizer that can successfully transcribe all of the various kinds of podcasts, the mechanism we propose will gradually increase the usefulness and applicability of PodCastle.
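The search-and-correct loop described in the abstract can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not PodCastle's implementation: the class `TranscriptIndex`, its methods, and the episode id `ep1` are hypothetical names, transcripts are assumed to be already split into words, and the index is a plain in-memory inverted index. It shows only how a user's correction can immediately change which podcasts a search term finds; feeding corrections back into the recognizer itself is outside its scope.

```python
from collections import defaultdict


class TranscriptIndex:
    """Toy inverted index over recognition results (hypothetical, illustration only)."""

    def __init__(self):
        self.transcripts = {}             # podcast_id -> list of recognized words
        self.postings = defaultdict(set)  # word -> set of podcast_ids

    def add(self, podcast_id, words):
        """Store a recognition result and index each of its words."""
        self.transcripts[podcast_id] = list(words)
        for word in words:
            self.postings[word].add(podcast_id)

    def search(self, term):
        """Return the ids of podcasts whose current transcript contains the term."""
        return sorted(self.postings.get(term, set()))

    def correct(self, podcast_id, position, new_word):
        """Apply a user's correction to one word and refresh the index."""
        words = self.transcripts[podcast_id]
        old_word = words[position]
        words[position] = new_word
        # Drop the old posting only if the misrecognized word no longer occurs.
        if old_word not in words:
            self.postings[old_word].discard(podcast_id)
        self.postings[new_word].add(podcast_id)


# Usage: a misrecognized word hides the episode until a user corrects it.
index = TranscriptIndex()
index.add("ep1", ["speech", "recondition", "research"])
before = index.search("recognition")   # the term is not in the transcript yet
index.correct("ep1", 1, "recognition")
after = index.search("recognition")    # the corrected transcript now matches
```

Keeping the corrected transcript and the index in one structure means each correction is visible to the next search without re-indexing the whole collection.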


Related papers

PodCastle: A Spoken Document Retrieval Service Improved by Anonymous User Contributions

In this invited paper, we introduce a public web service, PodCastle, that provides full-text searching of speech data (Japanese podcasts) on the basis of automatic speech recognition technologies. This is an instance of our research approach, Speech Recognition Research 2.0, which is aimed at providing users with a web service based on Web 2.0 so that they can experience state-of-the-art speech...


Automatic Transcription for a Web 2.0 Service to Search Podcasts (INTERSPEECH 2007)

This paper describes speech recognition techniques that enable a Web 2.0 service “PodCastle” where users can search and read transcribed texts of podcasts, and correct recognition errors in those texts. Most previous speech recognizers had difficulties transcribing podcasts because podcasts include various kinds of contents recorded in different conditions and cover recent topics that tend to h...


PodCastle: Recent Advances of a Spoken Document Retrieval Service Improved by Anonymous User Contributions

In this paper, we introduce recent advances of a speech retrieval web service, PodCastle, that collects and amplifies voluntary contributions by anonymous users. Our goal is to provide users with a public web service based on speech recognition and crowdsourcing so that they can experience state-of-the-art speech recognition performance through a useful service. PodCastle enables users to find ...


Podcastle: collaborative training of acoustic models on the basis of wisdom of crowds for podcast transcription

This paper presents acoustic-model-training techniques for improving automatic transcription of podcasts. A typical approach for acoustic modeling is to create a task-specific corpus including hundreds (or even thousands) of hours of speech data and their accurate transcriptions. This approach, however, is impractical in podcast-transcription task because manual generation of the transcriptions...


Publication date: 2007